(57) ABSTRACT

A real-time receiver and method for receiving and playing 
out real-time packetized data are disclosed. The receiver 
includes a packet transmission fixed delay estimator and a 
packet transmission variable delay estimator. The fixed 
delay estimator determines, using packets received up to the 
current point in a conference, the non-variable portion of
observed delays. This non-variable portion is subtracted 
from each packet's observed delay to obtain a variable delay 
estimate for that packet. 

Since variable delays actually drive the buffering time 
needed at the receiver to achieve smooth playout, the packet 
variable delay estimates can be used directly to adjust 
playout delay. Adaptive playout delay is preferably set 
aggressively low, based on observed packet variable delay 
estimates, to reduce data latency. Playout delay can be 
adjusted rapidly upwards when higher packet delays are 
observed, allowing rapid adaptation to network statistical 
variations and reducing the frequency of late packets. 

18 Claims, 6 Drawing Sheets 
CLOCK SYNCHRONIZATION AND
DYNAMIC JITTER MANAGEMENT FOR 
VOICE OVER IP AND REAL-TIME DATA 

FIELD OF THE INVENTION 

This invention pertains generally to methods and systems 
for communication of real-time audio, video, and data 
signals over a packet-switched data network, and more 
particularly to methods and systems for managing real-time 
data packet receipt and playout in the presence of variable 
packet delays. 

BACKGROUND OF THE INVENTION 

Most data networks are packet-switched. Data is commu- 
nicated over a packet-switched network in small chunks, or 
"packets", which require no dedicated circuit. Each packet 
contains information that allows the data network to route it 
to the appropriate destination. Packets from many different 
senders travel sequentially over single connections between 
routing points, and packets from the same sender may travel 
different routes as network conditions change. 
Consequently, consecutive packets from a specific sender to 
a specific receiver may experience different delays as they 
travel different routes or experience different competing 
traffic loads along the network. 

Researchers have sought ways to communicate real-time 
information over packet-switched data networks in order to 
take advantage of the time-varying nature and information 
redundancies found in most real-time data. For example, it 
is now possible to route voice telephone traffic over data 
networks through a technique commonly referred to as
"Voice Over IP", or "VoIP" for short. VoIP can require
significantly less average bandwidth than a traditional 
circuit-switched connection for several reasons. First, by 
detecting when voice activity is present, VoIP can choose to 
send little or no data when a speaker on one end of a 
conversation is silent, whereas a conventional, circuit- 
switched telephone connection continues to transmit during 
periods of silence. Second, the digital audio bitstream uti- 
lized by VoIP may be significantly compressed before trans- 
mission using a codec (compression/decompression) 
scheme. Using current technology, a telephone conversation 
that would require two 64 kbps (one each way) channels 
over a circuit-switched network may utilize a data rate of 
roughly 8 kbps with VoIP. 

The variation in packet arrival rate, or "jitter", that exists on
most packet networks presents challenges for real-time
communication. To compensate for jitter, a real-time 
receiver must buffer packets for an amount of time sufficient 
to allow orderly, regular playout of the packets. Researchers 
have long recognized the need for an accurate method of 
receiver playout buffer length selection in real-time packet 
data communications such as VoIP. If the buffer delay is too 
short, "slower" packets will not arrive before their desig- 
nated playout time and playout quality suffers. If the buffer 
delay is too long, it noticeably disrupts interactive commu- 
nications. Selection of a near-optimal packet buffer delay for 
real-time communications requires accurate knowledge of 
actual packet delays. 

Various protocols have been suggested for allowing 
receivers to obtain delay information. These include two 
described by W. Montgomery, "Techniques for Packet Voice 
Synchronization", IEEE J. on Selected Areas in Comm., vol. 
SAC-1, No. 6, pp. 1022-1028, Dec. 1983. One protocol uses 
an absolute clock reference by both a sender and a receiver. 
The sender timestamps each packet, and the receiver compares
the timestamps on packets it receives to the absolute
clock reference to determine delay. A second protocol would 
require that each packet switch along the network update a 
packet delay field to include the amount of time the packet
was delayed by the switch. Since switches are the major
source of variations in delay, the receiver can estimate delay 
by examining the delay field in received packets. 

Unfortunately, neither of the protocols mentioned above is
in widespread use today. Instead, most real-time packet
data transmissions utilize the Real-time Transport Protocol
(RTP). A sender using this protocol includes a packet 
timestamp generated from a local clock. The clock rate used 
to generate consecutive RTP timestamps is the clock rate of 
the data being transmitted; thus two consecutive packets
should carry timestamps that differ by the number of data
samples contained in the first of the two packets. Although
RTP timestamps allow a receiver to reassemble samples in 
correct order, they contain no absolute delay information 
because the sender and receiver local clocks are not syn- 
chronized. 

Despite the lack of absolute delay information in RTP
headers, researchers have found ways to use adaptive, rather 
than fixed, buffer delays with RTP data streams. Although a 
fixed playout buffer delay can work in some circumstances 
(particularly with real-time communication over local area
networks), adaptive playout buffer delay methods will generally
perform better over a range of network conditions. An
adaptive method attempts to minimize delay for current 
network conditions. Most techniques for adaptively adjusting
buffer delay base their adjustments on statistics gleaned
from RTP (or similar) timestamp histories. Four such techniques
are discussed in R. Ramjee, et al., "Adaptive Playout
Mechanisms for Packetized Audio Applications in Wide-Area
Networks" in Proceedings of the Conference on Computer
Communications (IEEE Infocom), (Toronto, Canada),
pp. 680-688, June 1994.

Each technique discussed in Ramjee et al. computes a
delay estimate d_i and a delay variation v_i for each packet i.
The basic adaptive algorithm is illustrated in FIG. 1. A
packet i, containing a timestamp ts_i affixed to packet i by the
sender, is received from packet-switched network 20 by
receiver 16. Summer 24 subtracts timestamp ts_i from a
receive timestamp tr_i, taken from receiver clock reference
22, to produce a difference sample n_i. With RTP, this
difference will include an offset equal to the difference
between the sender and receiver clock references. First-order
filter 26 computes a mean delay estimate d_i from
difference samples n_i. Summer 28 feeds the absolute value
of the difference between d_i and n_i to a second filter 30,
which uses these samples to create a filtered estimate of the
variation in delay v_i. Multiplier 32 produces a multiple k of
v_i, which summer 34 adds to d_i and ts_i to create a playout
time p_i for packet i.
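
The signal flow of FIG. 1 can be sketched in C as follows. This is
only an illustration under stated assumptions: the text says filters
26 and 30 are filters but gives no coefficients, so the common
exponential form d = alpha*d + (1 - alpha)*n is assumed for both, and
seeding the mean with the first difference sample is likewise an
assumption. The names mean_var_estimator and mean_var_playout are
hypothetical.

```c
/* Hypothetical state for the mean/variation estimator of FIG. 1. */
typedef struct {
    double d;       /* filtered mean of difference samples n_i        */
    double v;       /* filtered absolute deviation |d_i - n_i|        */
    int    primed;  /* 0 until the first packet has been processed    */
} mean_var_estimator;

/* Process one packet and return its playout time p_i = ts_i + d_i + k*v_i.
 * alpha is an assumed first-order filter coefficient (0 < alpha < 1);
 * k is the variation multiplier applied by multiplier 32. */
double mean_var_playout(mean_var_estimator *e, double ts, double tr,
                        double alpha, double k)
{
    double n = tr - ts;  /* difference sample n_i (includes clock offset) */
    double dev;

    if (!e->primed) {
        /* Assumption: seed the mean with the first sample. */
        e->d = n;
        e->v = 0.0;
        e->primed = 1;
    } else {
        e->d = alpha * e->d + (1.0 - alpha) * n;      /* filter 26 */
        dev = e->d - n;                               /* summer 28 */
        if (dev < 0.0)
            dev = -dev;
        e->v = alpha * e->v + (1.0 - alpha) * dev;    /* filter 30 */
    }
    return ts + e->d + k * e->v;                      /* summer 34 */
}
```

Because the deviation is taken as an absolute value, a packet that
arrives earlier than the running mean raises v_i just as a late packet
does.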

Ramjee et al.'s other three discussed methods comprise
various heuristic adaptations of the adaptive playout delay
estimator of FIG. 1. One adaptation uses different time
constants for filter 26, depending on whether the latest
measurement n_i will increase or decrease delay estimate d_i.
Another adaptation suspends delay estimate filtering
temporarily if it detects a "spike" in the packet arrival rate. A
fourth algorithm dispenses with filter 26 altogether, by
examining all n_i computed for the last talkspurt received and
setting d_i to the minimum of these values for the next
talkspurt.

SUMMARY OF THE INVENTION

The present invention provides a packet-based real-time 
communication system utilizing an adaptive jitter management
system to reduce buffer latency while avoiding jitter
underflow (jitter underflow occurs when the playout buffer
runs out of data to play out). This system seeks to overcome
several deficiencies in prior art adaptive systems, thereby
providing increased performance over a wide variety of
network conditions.

Variation in packet delay is not a stationary process.
Despite this, most prior art algorithms attempt to estimate
packet delay statistics with time-based estimates such as
mean arrival time and variance from mean arrival time. Such
algorithms tend to underperform at startup, as well as when
packet delay statistical transitions occur. Furthermore, a
"mean+rule-of-thumb times variance" playout estimate
must be keyed to assumptions about the expected distribution
of packet delays; because these assumptions will not
always hold, the rule-of-thumb must be set conservatively.
The prior art has attempted to cope with these deficiencies
through a variety of heuristic adaptations, such as the
statistical anomaly "spike" detector discussed in Ramjee et
al. It is recognized herein that statistical estimation techniques
are generally ill-suited for adaptive jitter control over
a time-variant network.

The present invention avoids the statistical pitfalls of the
prior art by basing playout buffer adjustments on the one
stable statistic that exists in a packet-switched conference:
fixed transmission delay. Instead of referencing statistical
estimates to recent trends in the data, the present invention
computes variable packet delays with reference to a minimum
delay estimate valid for all received packets. The
stability of the minimum delay statistic allows the present
invention to accurately follow the jitter envelope of the
variable packet delays and adjust playout time accordingly.
A further benefit of the system is rapid convergence of the
minimum delay statistic, which allows aggressive initial
settings and good performance at connection startup.

In one aspect of the present invention, a packet-based
real-time data receiver comprises a packet transmission
fixed delay estimator and a packet transmission variable
delay estimator. The fixed delay estimator keeps track of
fixed delay (including offsets) over the duration of a conference
connection. When a conference packet is received
prior to a minimum arrival time predicted for that packet by
the current fixed delay estimate, the fixed delay estimate is
adjusted downwards (i.e., a packet with lower than the
predicted fixed delay has been received, therefore the fixed
delay estimate was too high). The packet transmission
variable delay estimator calculates a variable delay for each
received packet. A minimum arrival time based on the fixed
delay estimate is subtracted from the packet's actual arrival
time by the variable delay estimator.

The packet variable delays are preferably used by an 
adaptive playout delay estimator within the receiver. The 
adaptive playout delay estimator adapts packet playout delay 
in an attempt to reduce latency as much as possible without
causing jitter underflow. In a preferred embodiment, this 
estimator performs a non-linear filter operation on the packet 
variable delays. The receiver may use the packet playout 
delay to control a playout buffer. 

In another aspect of the present invention, a method of
receiving and playing packetized real-time data is disclosed.
When a real-time conference is established over a packet-switched
network, a packet transmission fixed delay estimate
and a playout delay estimate are initialized. A packet
delay is calculated for each packet as it is received. The fixed
delay estimate is adjusted downwards if the fixed delay
estimate is greater than the packet delay for the current
packet. A packet variable delay estimate is then obtained for
the packet by subtracting the fixed delay estimate from the
packet delay.

Preferably, the method further comprises non-linear adap- 
tation of a playout delay estimate. In one embodiment, as 
packet variable delay estimates are calculated, they are 
filtered into the playout delay estimate using a non-linear 
gain filter. The gain of the filter is based on the ratio of the 
packet variable delay estimate to the playout delay estimate. 

BRIEF DESCRIPTION OF THE DRAWING 

The invention may be best understood by reading the 
disclosure with reference to the following figures: 

FIG. 1, which shows a block diagram of a prior art 
adaptive playout delay system. 

FIG. 2, which shows a breakdown of real-time data 
latency into its components on a timeline. 

FIGS. 3 and 4, which show probability of packet arrival 
as a function of packet send time for two network delay 
distributions. 

FIG. 5, which illustrates a packet-based real-time data 
receiver according to one embodiment of the present inven- 
tion. 

FIGS. 6 and 7, which illustrate two playout delay transfer 
functions useful with the present invention. 

FIGS. 8-11, which compare a prior art adaptive playout 
delay method to a method according to one embodiment of 
the invention for two packet arrival sequences. 

FIGS. 12 and 13, which illustrate performance of embodi- 
ments of the invention for skewed send and receive clock 
rates. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 

The present invention generally applies to systems that 
receive real-time packet-switched data. Real-time data, as 
understood in the art, refers to data whose usefulness decays 
rapidly if delayed by more than a few seconds, such as 
interactive voice or video conferencing. One type of real- 
time data receiver that can employ the present invention is 
a computer connected to a packet network and either run- 
ning VoIP software on its microprocessor, or having spe- 
cialized VoIP hardware or firmware. The invention also 
applies to a data network telephony gateway. When a 
gateway operates as a receiver, it must buffer voice packets 
and output a continuous digital or analog stream onto a 
circuit-switched system. Other applicable systems include 
PBX equipment, packet network video or multimedia, and 
other real-time data delivery systems. 
Packet Arrival Time Distributions 

For real-time packet-switched data receivers, latency, i.e.
the difference between packet send time and packet playout
time, is of primary interest. With reference to FIG. 2, playout
time t_p for a given packet is related to t_0, the time that the
packet was constructed by the sender, by a concatenation of
three delays. The first delay, d_f, represents the minimum
travel time that a packet will incur in the network as it passes
from sender to receiver. The second delay, d_v, represents the
variable delay incurred by a packet in the network, e.g., due
to competition with other network traffic. A packet is actually
received at receive time t_r = t_0 + d_f + d_v. The receiver places
the packet in a buffer until the designated playout time t_p.
The difference between playout time t_p and receive time t_r
represents the buffer delay d_b set by the receiver for that
packet. If d_b is set too low, t_r may exceed t_p for some
packets (i.e. late packets) and these packets will miss their
playout time. Conversely, if d_b is set too high, packets will
wait unnecessarily long for playout.
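
The three delays described with reference to FIG. 2 can be written out
as a short worked relation. The symbol D, the total send-to-playout
offset chosen by the receiver, is introduced here only for
illustration and does not appear in the figures.

```latex
\begin{align*}
t_r &= t_0 + d_f + d_v \\
t_p &= t_r + d_b = t_0 + d_f + d_v + d_b \\
D   &= t_p - t_0 \quad\text{(total latency)} \\
\text{packet is late} &\iff t_r > t_p \iff d_v > D - d_f
\end{align*}
```

For a given D, whether a packet is late therefore depends only on its
variable delay d_v, since d_f is the same for every packet.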

FIGS. 3 and 4 depict two probability distributions for
packet arrival time, p_pa, as a function of t_0, over the duration
of a conference. FIG. 3 shows p_pa as a Rayleigh distribution
60, while FIG. 4 shows p_pa as a uniform distribution 62. In
both cases, the probability that a packet arrives prior to
t_0 + d_f is zero. With a fixed playout time t_p, a few packets
will arrive too late for playout if packets are distributed as
shown by distribution 60. For distribution 62, all packets
arrive well ahead of playout time t_p.

Most adaptive playout control systems attempt to estimate
mean arrival time μ_a and arrival time variance v_a for p_pa.
These systems generally set t_p = μ_a + k v_a, where k is a constant.
As the system cannot know p_pa, it must set k conservatively
(note that distributions 60 and 62, as shown, have the same
mean arrival time). And since p_pa is generally non-stationary,
mean and variance may be difficult to estimate
and track. Finally, variance itself contains some information
of little value in setting buffer delay, i.e., information about
the variation in packet arrival for packets that arrive before
the mean arrival time (note that a minimally-delayed packet
increases variance, thus increasing playout delay for such a
system).
Fixed Delay and Variable Delay Estimation 

The present invention abandons the concepts of mean
arrival time and variance. Instead, an adaptive playout
control system according to the invention estimates the
fixed minimum arrival time for the conference. The fixed
minimum arrival time is a stable statistic for all network
packet arrival time distributions, both stationary and non-stationary.
As will be shown, errors in the initial estimate of
minimum arrival time can be corrected with no performance
penalty. The preferred embodiments calculate packet jitter
for each received packet as the difference between the
minimum arrival time and the actual arrival time for that
packet. Playout buffer delay is computed from packet jitter
values.

FIG. 5 depicts an adaptive packet-based real-time data
receiver according to one embodiment of the present
invention. Receiver 54 receives packets i from packet data
network 20, stores packet data in playout buffer 50, and
relays the send timestamp ts_i from packet i to the adaptive
circuitry. Playout buffer control 48 computes a playout time
p_i for each packet i, and releases packets to playout device
52 at their designated playout time.

Summer 40 computes a raw packet delay n_i for each
packet i as the difference between the send timestamp ts_i and
a receive timestamp tr_i. Generally, timestamps generated by
the sending system and the receiving system are not synchronized.
The present invention functions whether or not
send and receive clocks are synchronized, although the
remainder of the discussion assumes a lack of synchronization.
Receive timestamp tr_i is computed from a receive
clock. The receive clock utilizes a reference clock source
related to the real-time data rate; preferably, the timestamp
is supplied by playout buffer control 48. Buffer control 48
preferably increments a timestamp counter each time a
sample or frame of data is output to playout device 52; this
counter is a convenient reference source for tr_i.

Fixed delay estimator 42 uses raw packet delays n_i to
compute a minimum packet delay estimate d_f. In its simplest
form, fixed delay estimator 42 implements a floor
function for all raw packet delays prior to and including raw
delay for packet i, i.e., $d_f = \min_{j \le i} n_j$. This delay
estimate is not a measure of absolute fixed delay, as it also
contains the offset between the unsynchronized send and
receive clocks (there is no mechanism to account for such a
clock offset separately from a real fixed delay). Delay
estimate d_f in this embodiment thus represents the minimum
clock offset observed over the conference up to packet i.

Variable delay estimator calculates a packet jitter value j_i
for each packet i. Packet jitter value j_i equals the estimated
absolute variable delay for packet i. Packet jitter, or absolute
variable delay, may be calculated by subtracting the clock
offset and fixed delay (both contained in d_f) from raw packet
delay n_i. Packet jitter values are fed to adaptive playout
delay estimator 46, which in turn feeds playout delay values
to playout buffer control 48.

Packet-based real-time data receiver 54 may advanta- 
geously be implemented as a programmed microprocessor 
or signal processor. Although machine-level programming is
processor-specific, the following pseudocode may be 
adapted to a specific processor for use in an adaptive playout 
control system of a real-time data receiver. 



    /* timestamp processing for each packet */
    if (first packet)
    {
        /* initialization */
        /* set minimum packet delay to delay of first sample */
        fixed_delay = receive_clock - timestamp;
    }
    /* compute absolute variable delay for packet */
    packet_jitter = receive_clock - timestamp - fixed_delay;
    /* if packet delay is less than current minimum, adjust minimum */
    if (packet_jitter < 0)
    {
        fixed_delay = receive_clock - timestamp;
        packet_jitter = 0;
    }



This code initializes the fixed delay estimate with a first 
timestamp difference. A packet jitter value is computed for 
each packet by subtracting the fixed delay from the times- 
tamp difference for that sample. A negative packet jitter 
value indicates that the packet arrived before the minimum 
arrival time predicted by the current fixed delay estimate. In 
such a case, the fixed delay estimate is set to the timestamp 
difference of the new packet, and that packet's jitter is reset 
to zero. 
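
As a minimal sketch, the pseudocode above might be adapted to
compilable C as shown below. The struct and function names are
hypothetical, and the use of signed long values for both clocks (so
that a negative clock offset is representable and counter wraparound
is ignored) is a simplifying assumption, not something the pseudocode
specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the fixed delay estimate. */
typedef struct {
    long fixed_delay;  /* smallest (receive_clock - timestamp) seen so far */
    int  initialized;  /* 0 until the first packet has been processed      */
} fixed_delay_estimator;

/* Process one packet; returns its packet jitter (always >= 0) and
 * lowers the fixed delay estimate when a faster packet is seen. */
long packet_jitter_update(fixed_delay_estimator *est,
                          long receive_clock, long timestamp)
{
    long raw_delay;
    long packet_jitter;

    assert(est != NULL);
    raw_delay = receive_clock - timestamp;

    if (!est->initialized) {
        /* set minimum packet delay to delay of first sample */
        est->fixed_delay = raw_delay;
        est->initialized = 1;
    }

    /* compute absolute variable delay for packet */
    packet_jitter = raw_delay - est->fixed_delay;

    /* if packet delay is less than current minimum, adjust minimum */
    if (packet_jitter < 0) {
        est->fixed_delay = raw_delay;
        packet_jitter = 0;
    }
    return packet_jitter;
}
```

For example, with raw delays of 1000, 1040, 980, and 1000 clock units
in that order, this sketch returns jitter values of 0, 40, 0, and 20,
and the fixed delay estimate ends at 980.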

Several safety measures may also be implemented in the 
above pseudocode. For instance, packets received out of 
sequence or otherwise suspect may be allowed to adjust 
packet_jitter in only small increments, e.g., one frame. 
Packets received very late may be marked so that they will 
not affect playout delay estimates at all. However, a long 
sequence (e.g., 8 packets) of consecutive very late packets 
may signify that an error has occurred that requires a reset 
of the adaptive playout system. 
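
One way the very-late-packet safety measure could be sketched is shown
below. The threshold that defines "very late" is not given in the
text, so it is passed in as a hypothetical parameter; the run length
of 8 follows the example above. All names are illustrative.

```c
/* Run length of consecutive very late packets that triggers a reset
 * (the text gives 8 as an example). */
#define LATE_RUN_RESET 8

typedef struct {
    int late_run;  /* consecutive very late packets seen so far */
} late_monitor;

typedef enum {
    PKT_USE,     /* packet may update the playout delay estimate      */
    PKT_IGNORE,  /* very late: keep it out of the playout estimate    */
    PKT_RESET    /* long run of very late packets: reset the system   */
} pkt_action;

/* Classify a packet by its jitter. very_late_threshold is an assumed,
 * application-chosen limit in the same units as jitter. */
pkt_action classify_packet(late_monitor *m, long jitter,
                           long very_late_threshold)
{
    if (jitter <= very_late_threshold) {
        m->late_run = 0;
        return PKT_USE;
    }
    m->late_run++;
    if (m->late_run >= LATE_RUN_RESET) {
        m->late_run = 0;
        return PKT_RESET;
    }
    return PKT_IGNORE;
}
```

A caller would reinitialize both the fixed delay estimate and the
playout delay estimate when PKT_RESET is returned.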

Playout Delay Estimation Using Variable Packet Delays 

Jitter values as computed above are constrained to a
time-varying envelope of arrival times bounded below by
the fixed delay. The upper bound of this envelope must be set
high enough to achieve acceptable late packet rates; for
instance, for the ITU G.729 voice codec, voice quality
degradation becomes noticeable if more than about 1.0% of
transmitted voice packets miss their scheduled playout time.
At the same time, talkspurts should generally be played out
as soon as possible, dictating that the upper bound of the
envelope adapt to recent packet jitter values.

A preferred embodiment of the invention includes a
playout delay estimator. Essentially, such an estimator
adjusts an estimate of the upper bound of packet arrival
times by comparing the current upper bound to measured
packet jitter values. A simple estimator operating on this
principle adjusts delay by filtering a constant multiple k of
observed jitter values. This delay estimate d_i, based on
packet i and previous delay estimate d_{i-1}, may be expressed
as

$$d_i = \alpha d_{i-1} + (1 - \alpha) k j_i$$

This estimator functions acceptably when used with relatively
time-stable packet arrival distributions having a low
probability of j_i > k d_i. FIG. 6 illustrates an envelope
estimator transfer function, having a nonlinear gain, that is
particularly preferred for time-variant packet arrival
distributions. No filter adjustment occurs with this filter for
packet i if the ratio is

$$\frac{j_i}{d_{i-1}} = \frac{1}{k}$$

As the ratio of packet jitter to delay estimate varies away
from 1/k, the filter gain increases non-linearly, thus allowing
the estimator to better track sudden variations in the arrival
time upper bound.

In one embodiment, such a nonlinear estimator is approximated
by applying different filters at different ranges, or
zones, of the ratio of j_i to d_{i-1}. The following filter selection
approximates nonlinear filtering with k=1.6 and avoids
direct ratioing by division, instead comparing j_i to
binary-shifted versions of d_{i-1}.

$$
d_i = \begin{cases}
\alpha_1 d_{i-1} + (1 - \alpha_1) k j_i, & j_i < 0.25\, d_{i-1} \\
\alpha_2 d_{i-1} + (1 - \alpha_2) k j_i, & 0.25\, d_{i-1} \le j_i < 0.50\, d_{i-1} \\
d_{i-1}, & 0.50\, d_{i-1} \le j_i < 0.75\, d_{i-1} \\
\alpha_3 d_{i-1}, & 0.75\, d_{i-1} \le j_i < d_{i-1} \\
\alpha_4 d_{i-1}, & d_{i-1} \le j_i
\end{cases}
$$

Gain factor settings used in one embodiment of the
invention allow binary shifts and adds to be substituted for
multiplies and divides; e.g., $\alpha_1 = 1 - 2^{-9}$,
$\alpha_2 = 1 - 2^{-11}$, $\alpha_3 = 1 + 2^{-6}$, and
$\alpha_4 = 1 + 2^{-2}$ for 20 msec packet sizes. This transfer
function is illustrated in FIG. 7. One characteristic of this
setting is a quick envelope response to jitter values that
approach or exceed delay estimate d_i (e.g., a 25% increase
in d_i for jitter to delay estimate ratios greater than one). In
contrast, the envelope responds relatively slowly to small
jitter values. This behavior is desirable as it allows large
jitter values a heavier weighting in the calculation of delay
estimate d_i.
Playout Delay Estimation Examples

FIGS. 8-11 compare the response of a prior art mean/variance
delay estimator to the response of a delay estimator
according to the invention, for two sequences of variable
packet delay. FIGS. 8 and 9 illustrate a first packet delay
sequence (packet delays represented as circles). In these
figures, the vertical baseline is the true fixed delay for the
sequence.

FIG. 8 illustrates the response of a prior art receiver 16 as
in FIG. 1 to the packet delay sequence. Curve 70 plots the
mean estimate calculated by receiver 16, and curves 72 and
73 show two playout delay estimates. Curve 72 uses a
variance multiplier k=2, while 73 uses k=4 as discussed in
Ramjee et al. Packet 1 of the sequence experiences a
relatively high variable delay, resulting in a high initial
estimate for mean 70. As packet delays decrease towards the
end of the sequence, playout delay estimates 72 and 73
remain high. This occurs not only because of the high initial
mean estimate, but because the low-delayed packets (i.e.
packets 4, 7, 11, 12) actually increase playout delay estimates
72 and 73 because they vary from the mean by a
relatively large (although negative) amount. As a result,
playout of the latter portion of the sequence may be delayed
2 to 5 frames longer than actually required for the sequence.

FIG. 9 shows the same packet variable delay sequence,
along with a fixed delay estimate 74 and two playout delay
estimates 76, 77 according to embodiments of the present
invention. Like mean estimate 70 above, fixed delay estimate
74 starts off badly in error because of the high delay of
packet 1. But as each packet with a smaller delay than
previously observed packets arrives (i.e. packets 2, 4, 7), fixed
delay estimate 74 tracks downward towards the true fixed
delay. From packet 7 on, estimate 74 represents the true
fixed delay of the connection.

Playout delay 76 follows the 5-region non-linear gain
jitter filter methodology set out in FIG. 7 and in the section
above for playout delay estimate d_i. The embodiment
represented by delay 76 uses compensation to avoid direct
mirroring of changes in fixed delay estimate 74 in playout
delay estimate 76. For instance, at packet 2 the fixed delay
estimate adjusts downwards two frames. Delay estimate d_i is
adjusted upwards two frames at this point in compensation,
such that playout delay 76 does not track fixed delay
estimate 74 directly. Playout delay 76 accurately mirrors
trends in packet delay over the sequence, while providing a
one to two frame cushion.

Curve 77 represents playout delay calculated using a
second embodiment of the invention. This embodiment
differs from the embodiment producing delay 76 in that it
does not compensate d_i for downward shifts in fixed delay
74. Thus at packet 2, playout delay 77 tracks the two-frame
adjustment in fixed delay 74, placing it lower than the actual
delay of packet 3. This causes the delay estimator to sharply
increase d_i at packet 3, although playout delay 77 drops
again at packet 4 due to another adjustment in fixed delay 74.
Once fixed delay 74 stabilizes, curve 77 should begin to
converge with curve 76.

FIGS. 10 and 11 illustrate a second packet arrival
sequence. FIG. 10 illustrates performance for prior art
adaptive delay estimator 16. Packet 1 experiences a relatively
low delay, forcing a low initial mean estimate 78.
Other packets with low delay (packets 2, 4, 8, 11, 12, 13)
negatively affect growth of playout delay 80 because of their
low variance. Consequently, packet 3 arrives at the current
playout estimate, and packets 6, 7, 9, 10, and 14 arrive too
late for their estimated playout time with k=2 (curve 80).
Playout delay estimate 81, with k=4, appears adequate,
although this appearance is largely due to the low mean
estimate.

FIG. 11 shows the same packet arrival sequence as FIG.
10, this time using fixed delay adjustment-compensating
(curve 84) and non-compensating (curve 86) embodiments
as described in the description accompanying FIG. 9. Fixed
delay estimate 82 adjusts once, at packet 4, where the
minimum clock offset observed over the packet sequence
occurs. Playout delay estimates 84 and 86 adjust rapidly to
envelop the numerous long-delay samples in this sequence.
After packet 4, playout delay estimates 84 and 86 begin to
converge.

FIGS. 8 through 11 illustrate different startup scenarios
that an adaptive playout delay estimator may encounter. But
such scenarios also represent statistical shifts in the packet
arrival time distribution that may occur mid-conference. The
minimum delay estimate of the invention provides a solid
reference during these shifts from which playout delay may
be adjusted. As a result, the present invention rapidly detects
and adjusts to increasing packet delays. Generally, this
allows the present invention to maintain a more aggressive
playout schedule than prior art systems.
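
The five-zone filter selection given earlier for playout delay
estimate d_i (k=1.6, with the example gain factors for 20 msec
packets) can be sketched in C as below. For readability this sketch
uses floating-point multiplies rather than the binary shifts and adds
the text describes, and the function name is hypothetical.

```c
/* Constant multiple k applied to observed jitter (value from the text). */
#define K_GAIN 1.6

/* One update of the playout delay estimate.
 * d_prev is the previous estimate d_{i-1}; jitter is j_i (same units).
 * Returns the new estimate d_i. */
double playout_delay_update(double d_prev, double jitter)
{
    const double a1 = 1.0 - 1.0 / 512.0;   /* 1 - 2^-9  */
    const double a2 = 1.0 - 1.0 / 2048.0;  /* 1 - 2^-11 */
    const double a3 = 1.0 + 1.0 / 64.0;    /* 1 + 2^-6  */
    const double a4 = 1.0 + 1.0 / 4.0;     /* 1 + 2^-2  */

    if (jitter < 0.25 * d_prev)   /* zone 1: decay toward k * jitter */
        return a1 * d_prev + (1.0 - a1) * K_GAIN * jitter;
    if (jitter < 0.50 * d_prev)   /* zone 2: slower decay            */
        return a2 * d_prev + (1.0 - a2) * K_GAIN * jitter;
    if (jitter < 0.75 * d_prev)   /* zone 3: ratio near 1/k, hold    */
        return d_prev;
    if (jitter < d_prev)          /* zone 4: slow growth             */
        return a3 * d_prev;
    return a4 * d_prev;           /* zone 5: 25% increase            */
}
```

Because zones 4 and 5 scale the previous estimate, this sketch cannot
grow an estimate that starts at zero; it assumes d is initialized to a
positive value when the conference is established.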

Although receiver 54 preferably adjusts playout delay 
with every incoming data packet, the estimate preferably 
does not affect playout from buffer 50 (FIG. 5) at every 
frame. Playout buffer control 48 utilizes the output of 
envelope estimator 46 to adjust delay only at the beginning 
of each talkspurt. Effectively, playout delay is modulated by 
shrinking or stretching the amount of time between consecutive
talkspurts.
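
A minimal sketch of this talkspurt-boundary behavior follows. It
assumes the receiver can flag the first packet of a talkspurt (how
that is detected is not covered here) and that a packet's playout time
is its minimum arrival time plus the applied playout delay; both are
assumptions, and all names are hypothetical.

```c
typedef struct {
    long estimate;  /* latest output of the playout delay estimator */
    long applied;   /* delay currently in use by the playout buffer */
} playout_delay_state;

/* Called once per packet. The estimate is refreshed on every packet,
 * but it is only latched into the applied delay when talkspurt_start
 * is nonzero. min_arrival_time is the arrival time predicted by the
 * fixed delay estimate for this packet. Returns the playout time. */
long playout_time(playout_delay_state *s, long new_estimate,
                  long min_arrival_time, int talkspurt_start)
{
    s->estimate = new_estimate;
    if (talkspurt_start)
        s->applied = s->estimate;
    return min_arrival_time + s->applied;
}
```

Within a talkspurt the spacing between packets is therefore preserved,
and any change in delay is absorbed by the silence between talkspurts.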

Compensating for Statistical Shifts in Fixed Delay 

According to the present invention, a real-time packet 
receiver bases buffer length and playout delay on a fixed 
delay estimate. Problems may arise if this fixed delay is not 
truly "fixed" over the duration of a conference. The most 
common example of this is where the send clock and receive 
clock operate at slightly different rates, resulting in a con- 
stant bias rate in the computed packet timestamp differences. 
Another example of a shift in fixed delay involves the loss 
of a network path, forcing all packets to take a longer route. 
The present invention automatically corrects for negative 
bias rates and shifts (i.e., faster minimum packet arrivals), 
and with a slight modification, can correct for positive fixed 
delay bias rates and shifts also. 

FIG. 12 illustrates a negatively rate-biased packet arrival 
sequence 90. Fixed delay estimator 42 automatically tracks 
negative biases, which resemble "better" estimates of minimum
delay. Fixed delay estimate 92 stairsteps downward as
new samples with smaller clock offsets are received. Playout 
delay 94 may be configured to stairstep downwards with 
fixed delay estimate 92. Optionally, and as shown, playout 
delay 94 does not automatically stairstep downwards with 
every step of 92, but relies on its envelope-following char-
acteristics to track the negative rate-bias in packet arrival
sequence 90.
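
The stairstep behavior of the fixed delay estimate under a negative bias can be illustrated with a running-minimum sketch in Python. The class and argument names are hypothetical, and timestamps are assumed to be expressed in the same units on both the send and receive sides.

```python
class FixedDelayEstimator:
    """Keep the smallest clock offset seen so far as the fixed delay."""

    def __init__(self):
        self.fixed = None  # no estimate until the first packet arrives

    def update(self, send_ts, recv_ts):
        # Observed delay (clock offset) for this packet.
        offset = recv_ts - send_ts
        # A smaller offset is a "better" estimate of minimum delay,
        # so the estimate steps downward and never upward.
        if self.fixed is None or offset < self.fixed:
            self.fixed = offset
        # Variable delay is what remains above the fixed delay.
        return offset - self.fixed
```

For offsets of 100, 105, 98, 99 and 95, the fixed delay steps through 100, 100, 98, 98 and 95, and the returned variable delays are 0, 5, 0, 1 and 0.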

FIG. 13 illustrates a positively rate-biased packet arrival 
sequence 96. The minimum observed packet arrival occurs 
at point 98 in sequence 96. Using the basic fixed delay 
estimator of the present invention, fixed delay would remain 
at the value observed at point 98, as shown by curve 100, for 
the remainder of the conference. Over time, a large offset 
may develop between the true and the estimated fixed delay, 
resulting in unnecessary playout delay, suboptimal variable 
delay estimation, and possible eventual playout buffer 
exhaustion (depending on how the buffer is implemented). 

To combat the positive rate-bias problem, it is preferred 
that a small positive rate bias be introduced artificially into 
the fixed delay estimate. One method of accomplishing an 
artificial bias is to count packets since the last downward 
update to the fixed delay estimate. If the counter reaches a 
set target value, the fixed delay estimate is increased, e.g., by 
one frame. If the data has no actual positive rate-bias, a 
subsequent low-delay packet should quickly re-adjust the 
fixed delay estimate back down and reset the bias counter. 
Fixed delay estimate 102 illustrates how the artificial rate 
bias allows estimator 42 to track a positive rate bias in 
sequence 96. 
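
The counter-based artificial bias described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the names `target_count` and `step` are hypothetical tuning parameters, and the sketch applies the upward bump before the downward check so that a packet arriving below the bumped estimate pulls it back down immediately.

```python
class BiasedFixedDelayEstimator:
    """Running-minimum fixed delay with an artificial positive rate bias."""

    def __init__(self, target_count, step):
        self.fixed = None
        self.count = 0                    # packets since last adjustment
        self.target_count = target_count  # assumed tuning parameter
        self.step = step                  # assumed bump size, e.g. one frame

    def update(self, offset):
        if self.fixed is None:
            self.fixed = offset
            return 0
        self.count += 1
        if self.count >= self.target_count:
            # No downward update for target_count packets: bias upward.
            self.fixed += self.step
            self.count = 0
        if offset < self.fixed:
            # A low-delay packet re-adjusts the estimate back down
            # and resets the bias counter.
            self.fixed = offset
            self.count = 0
        return offset - self.fixed
```

With `target_count=3` and `step=2`, offsets of 100, 101, 102, 103, 104 and 100 yield variable delays of 0, 1, 2, 1, 2 and 0: the estimate is bumped to 102 at the fourth packet and returns to 100 when the low-delay packet arrives.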

In practice, most biases will be unnoticeable over the 
length of a conference. A low artificial bias rate, e.g., 
equivalent to one sample/packet, will generally be more than 
sufficient. If new low-delay packets are not observed after 
adjustment of the fixed delay upwards, the artificial bias rate 
may optionally be increased gradually until a new low-delay 
packet is found. One method of increasing bias rate is to 
reduce the set target value the counter must reach each time
an artificial up-adjustment with no preceding down- 
adjustment is made. 
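
One way to picture the gradually increasing bias rate is the Python sketch below. Several details are assumptions made only for illustration: the target is halved on each qualifying up-adjustment, it is never reduced below `min_target`, and it is restored to its initial value whenever a downward adjustment occurs. The text above specifies only that the target is reduced.

```python
class AcceleratingBiasEstimator:
    """Fixed delay estimator whose artificial bias rate increases while
    no new low-delay packet is observed."""

    def __init__(self, initial_target, min_target, step):
        self.fixed = None
        self.count = 0
        self.initial_target = initial_target
        self.min_target = min_target
        self.target = initial_target
        self.step = step
        self.down_since_up = True  # a down-adjustment preceded this point

    def update(self, offset):
        if self.fixed is None:
            self.fixed = offset
            return 0
        self.count += 1
        if self.count >= self.target:
            if not self.down_since_up:
                # Up-adjustment with no preceding down-adjustment:
                # shorten the wait before the next one (assumed halving).
                self.target = max(self.min_target, self.target // 2)
            self.fixed += self.step
            self.count = 0
            self.down_since_up = False
        if offset < self.fixed:
            # New low-delay packet found: step down and relax the bias.
            self.fixed = offset
            self.count = 0
            self.down_since_up = True
            self.target = self.initial_target
        return offset - self.fixed
```

With `initial_target=4`, `min_target=1` and `step=1`, a long run of high-delay packets causes up-adjustments after 4, 4, 2 and then 1 packets, until a low-delay packet resets both the estimate and the target.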

The invention has been described herein with reference to 
several illustrative embodiments. Other modifications to the 
disclosed embodiments will be obvious to those of ordinary 
skill in the art upon reading this disclosure, and are intended 
to fall within the scope of the invention as claimed. For 
example, many possible variations exist for an envelope 
estimator — the present invention teaches that such an esti- 
mator have the capability to decrease playout time in 
response to observed jitter values much lower than the 
current playout delay, and relatively rapidly increase playout 
time in response to observed jitter values of roughly the 
same magnitude or higher than the current playout delay. 
Likewise, other methods of implementing positive rate-bias
detection and compensation for fixed delay estimation will 
be immediately obvious to one of ordinary skill upon 
reading this disclosure. The particular playout buffer imple- 
mentation is not critical to the present invention. Numerical 
values disclosed herein are tuning parameters that may be 
adjusted for a given application using the principles taught 
in this disclosure. 
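
The asymmetric behavior expected of an envelope estimator, as summarized above, can be illustrated by the following Python sketch. All numerical values here (`headroom`, `up_gain`, `down_gain`, `near`, `far`) are hypothetical tuning parameters chosen only for the example, and the function is one possible variation rather than the estimator of the disclosed embodiments.

```python
def adapt_playout_delay(playout, jitter, headroom=2.0,
                        up_gain=0.5, down_gain=0.05,
                        near=0.8, far=0.25):
    """Return an updated playout delay given one observed jitter value.

    playout must be positive. The delay rises quickly when jitter is
    close to or above the current delay, decays slowly when jitter is
    much lower, and is held otherwise.
    """
    if playout <= 0:
        raise ValueError("playout delay must be positive")
    ratio = jitter / playout
    target = headroom * jitter  # assumed desired delay for this jitter
    if ratio >= near:
        gain = up_gain          # fast attack
    elif ratio <= far:
        gain = down_gain        # slow decay
    else:
        return playout          # hold in the middle zone
    return playout + gain * (target - playout)
```

With these example values, a jitter of 100 against a playout delay of 100 raises the delay to 150 in one step, whereas a jitter of 10 lowers it only to about 96.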

What is claimed is: 

1. A packet-based real-time data receiver comprising: 

a packet transmission fixed delay estimator, said fixed 
delay estimator keeping a fixed delay estimate over the 
duration of a conference connection using said receiver, 
and adjusting said fixed delay estimate downwards 
during said conference in response to the arrival of a 
conference packet prior to a minimum arrival time 
predicted for that packet by said fixed delay estimate; 
and 

a packet transmission variable delay estimator, said vari- 
able delay estimator calculating a variable packet delay 
for each conference packet by subtracting a predicted 
minimum arrival time for each packet, based on said 
fixed delay estimate, from the actual arrival time of 
each packet. 

2. The data receiver of claim 1, further comprising an 
adaptive playout delay estimator that adapts packet playout 
delay for said receiver using variable packet delays from 
said packet transmission variable delay estimator. 

3. The data receiver of claim 2, wherein said adaptive 
playout delay estimator comprises a non-linear packet vari- 
able delay filter. 

4. A packet-based real-time data receiver comprising: 

a playout buffer for queuing packets from a received data 
stream for playout, said packets in said received data 
stream each containing a packet timestamp generated 
by a remote system send clock operating at a send clock 
rate; 

a local timestamp generator operating at approximately 
said send clock rate; 

a packet transmission fixed delay estimator, said fixed 
delay estimator comparing the packet timestamp from 
each received packet to a receive timestamp taken from 
said local timestamp generator at approximately the 
arrival time of said received packet, and adjusting a 
fixed delay estimate downwards in response to the 
arrival of packets prior to an arrival time predicted by 
said fixed delay estimate; 

a packet transmission variable delay estimator, said vari- 
able delay estimator calculating a variable delay for 
each packet by subtracting said fixed delay estimate 
from the difference between the receive timestamp and 
the packet timestamp; and 

a playout delay estimator that non-linearly adapts a playout
delay estimate for received packets based on the
relative magnitude of the playout delay estimate as
compared to a variable delay calculated by said packet
transmission variable delay estimator.

5. The data receiver of claim 4, wherein said local 
timestamp generator is synchronized with said remote sys- 
tem send clock. 

6. The data receiver of claim 4, wherein said packet 
transmission fixed delay estimator, packet transmission vari-
able delay estimator, and playout delay estimator comprise
a programmed microprocessor.

7. A method of receiving and playing packetized real-time 
data, said method comprising the steps of: 

establishing a real-time conference over a packet-
switched network;

for a first real-time data packet received during said 
conference, initializing a packet transmission fixed 
delay estimate and a playout delay estimate for the 
conference; and

for each additional real-time data packet received during 
said conference, 

calculating a packet delay estimate, 

adjusting said fixed delay estimate downwards if said
fixed delay estimate is greater than said packet delay
estimate, and

subtracting said fixed delay estimate from said packet 
delay estimate, thereby obtaining a packet variable 
delay estimate.

8. The method of claim 7, wherein said step of initializing 
a packet transmission fixed delay estimate consists of setting 
said fixed delay estimate to equal the clock offset between a 
local clock reference and a timestamp affixed to said first 
data packet by its sender.

9. The method of claim 7, wherein said adjusting said 
fixed delay estimate downwards step comprises, for a data 
packet triggering such adjustment, resetting said fixed delay 
estimate to equal the clock offset between a local clock
reference and a timestamp affixed to that data packet by its
sender.

10. The method of claim 7, further comprising as a part of 
said adjusting said fixed delay estimate downwards step, 
adjusting said playout delay estimate upwards by an equiva-
lent amount.

11. The method of claim 7, further comprising the step of 
introducing an artificial positive rate bias in said fixed delay 
estimate. 

12. The method of claim 7, further comprising the step of 
adapting said playout delay estimate throughout the duration
of said conference to approximately maintain a preset ratio 
between said playout delay estimate and said packet variable 
delay estimates calculated for said real-time data packets. 

13. The method of claim 12, wherein said adapting said
playout delay estimate comprises applying a correction 
formula to said playout delay estimate for each packet 
variable delay estimate, with a correction formula gain 
determined by the ratio of that packet variable delay esti- 
mate to the playout delay estimate. 

14. The method of claim 13, wherein said gain is highest 
for packet variable delay estimates greater than the playout 
delay estimate and less than a preset maximum variable 
delay. 

15. The method of claim 13, comprising applying said 
correction formula by mapping said ratio into one of a 
plurality of ratio zones, each of said zones having a zone- 
specific gain formula. 

16. The method of claim 7, further comprising the step of 
fixing the playout time for data spurts during said conference 
using said playout delay estimate. 

17. The method of claim 7, wherein said step of adjusting 
said fixed delay estimate downwards comprises resetting 
said fixed delay estimate to said packet delay estimate. 

18. A method of receiving and playing packetized real- 
time data, said method comprising the steps of: 

establishing a real-time conference over a packet- 
switched network; 

for a first real-time data packet received during said 
conference, initializing a packet transmission fixed 
delay estimate and a playout delay estimate for the 
conference; 

introducing an artificial positive rate bias in said fixed
delay estimate;

for each additional real-time data packet received during
said conference,

calculating a packet delay estimate,

adjusting said fixed delay estimate downwards if said
fixed delay estimate is greater than said packet delay
estimate, and

subtracting said fixed delay estimate from said packet 
delay estimate, thereby obtaining a packet variable 
delay estimate; 

adapting said playout delay estimate throughout the dura- 
tion of said conference to approximately maintain a 
preset ratio between said playout delay estimate and 
said packet variable delay estimates calculated for said 
additional data packets; and 

periodically adjusting playout time for data packets 
received during said conference to include a buffer 
time, measured with reference to the fixed delay 
estimate, corresponding to said playout delay estimate. 

* * * * * 